Entry Name:  "IIITH-Gupta-MC1"

VAST Challenge 2017
Mini-Challenge 1

 

 

Team Members:

Ayushi Gupta, International Institute of Information Technology, Hyderabad, ayushi.gupta@research.iiit.ac.in     PRIMARY
Raghavendra Ch, International Institute of Information Technology, Hyderabad, raghavendra.ch@research.iiit.ac.in
Kamalakar Karlapalem, International Institute of Information Technology, Hyderabad, kamal@iiit.ac.in [Advisor]

Student Team:  YES

 

Tools Used:

D3.js

Plot.ly

Flask (web framework)

 

Approximately how many hours were spent working on this submission in total?

200 hrs

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2017 is complete? YES

 

Video

Video is attached with submission

https://www.youtube.com/watch?v=Vb6iIM_Btsk

 

 

 

Questions

1. “Patterns of Life” analyses depend on recognizing repeating patterns of activities by individuals or groups. Describe up to six daily patterns of life by vehicles traveling through and within the park. Characterize the patterns by describing the kinds of vehicles participating, their spatial activities (where do they go?), their temporal activities (when does the pattern happen?), and provide a hypothesis of what the pattern represents (for example, if I drove to a coffee house every morning, but did not stay for long, you might hypothesize I’m getting coffee “to-go”). Please limit your answer to six images and 500 words.


ANSWER1: To begin the analysis of the Lekagul sensor data, we first computed the number of unique vehicles of each type (cartype1 - 7487, cartype2 - 4717, cartype2P - 998, cartype3 - 3039, cartype4 - 1244, cartype5 - 817, cartype6 - 406). Cartype1, cartype2 and cartype3 are therefore far more numerous than any other vehicle type. We split the repetitive patterns into two parts: the first deals with individual car-ids and the second with groups. By groups, we mean vehicles of the same type, or vehicles following a similar route at similar times.
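The per-type totals above amount to counting distinct car-ids for each car-type. A minimal sketch of that count follows, assuming each sensor reading is a (timestamp, car-id, car-type, gate-name) tuple; the function name and the sample rows are hypothetical and are not taken from the challenge data.

```python
from collections import defaultdict


def unique_vehicles_per_type(readings):
    """Count distinct car-ids for each car-type.

    `readings` is an iterable of (timestamp, car_id, car_type, gate_name)
    tuples. This column layout is an assumption about the sensor file.
    """
    ids_by_type = defaultdict(set)
    for _timestamp, car_id, car_type, _gate in readings:
        # A set keeps each car-id once, however many gates it passed.
        ids_by_type[car_type].add(car_id)
    return {car_type: len(ids) for car_type, ids in ids_by_type.items()}


# Made-up sample rows for illustration only.
sample = [
    ("2015-05-01 09:00:00", "car-A", "1", "entrance1"),
    ("2015-05-01 09:30:00", "car-A", "1", "general-gate1"),
    ("2015-05-01 10:00:00", "car-B", "1", "entrance2"),
    ("2015-05-01 11:00:00", "car-C", "2P", "ranger-base"),
]
counts = unique_vehicles_per_type(sample)
```

Summing the values of the returned dictionary gives the overall number of unique vehicles, provided no car-id appears under two different types.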

PART1: INDIVIDUALS
1) The dataset contains a total of 18708 unique vehicles (cartype1 - 7487, cartype2 - 4717, cartype2P - 998, cartype3 - 3039, cartype4 - 1244, cartype5 - 817, cartype6 - 406). We then looked for vehicles that entered the park more than once. Only 5 vehicles entered the park more than twice (6 vehicles entered twice).
     a) 20154519024544-322 (cartype 2) (16 times): This car-id entered the preserve more often than any other. It visited 16 times between June and October (June: 2 times, July: 5 times, August: 4 times, September: 4 times, October: 1 time). On every visit the vehicle followed the same path. It always entered the camping location in the afternoon (around 15:00), left the next day at midnight, and went straight to the exit, leaving the preserve within 1 hour. The path traversed by this vehicle is shown in figure 1.1. [In the figure, the thickness of a road is proportional to the traffic flowing through that route, the blue path indicates the route the vehicle followed, and orange indicates the road network within the park.]

Figure 1.1 : Path followed by 20154519024544-322

b) 20153712013720-181 (cartype 3) (4 times): This car-id always visited camping6 and always followed the same path. It always entered the camping location around 14:00 and left the next day around 22:30. [figure 1.2]
Figure 1.2 : Path followed by 20153712013720-181, 20162904122951-717 and 20162027042012-940.

     c) Vehicles 20162904122951-717 and 20162027042012-940 also visited the preserve multiple times and visited one of the camping locations.
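One way to find such repeat visitors is to count how often each car-id is seen at an entrance gate. The sketch below assumes that every visit produces one entrance reading on the way in and one on the way out, and that entrance gates are named with the prefix "entrance"; both are assumptions, and the function names and sample rows are hypothetical.

```python
from collections import Counter


def visits_per_vehicle(readings):
    """Estimate the number of park visits for each car-id.

    Assumes one entrance reading when entering and one when leaving, so
    visits = entrance readings / 2 (rounded up for a vehicle still inside).
    """
    entrance_hits = Counter(
        car_id
        for _ts, car_id, _car_type, gate in readings
        if gate.startswith("entrance")
    )
    return {car_id: (hits + 1) // 2 for car_id, hits in entrance_hits.items()}


def repeat_visitors(readings, min_visits=2):
    """Return the sorted car-ids with at least `min_visits` visits."""
    visits = visits_per_vehicle(readings)
    return sorted(car_id for car_id, n in visits.items() if n >= min_visits)


# Made-up sample rows for illustration only.
sample = [
    ("2015-06-01 14:00:00", "car-A", "2", "entrance3"),
    ("2015-06-01 15:00:00", "car-A", "2", "camping5"),
    ("2015-06-03 00:30:00", "car-A", "2", "entrance3"),
    ("2015-07-10 14:10:00", "car-A", "2", "entrance3"),
    ("2015-07-12 00:20:00", "car-A", "2", "entrance3"),
    ("2015-06-02 09:00:00", "car-B", "1", "entrance1"),
    ("2015-06-02 11:00:00", "car-B", "1", "entrance2"),
]
visits = visits_per_vehicle(sample)
repeats = repeat_visitors(sample)
```

Vehicles that never pass an entrance gate do not appear in the result, so this count only covers vehicles that enter from outside the park.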



PART2: GROUPS
a) Finding groups within the data requires analysing the vehicle type, the time vehicles spend on a route, the time spent camping, and the most common path followed by each vehicle type. For this part, we first drew a heat map with time on the x-axis and gate name on the y-axis [figure 1.3]. In this representation, red blocks indicate very high traffic flow. The heat map shows that general-gate7 is the most heavily used gate from 07:00 to 14:00, along with general-gate2, general-gate1, ranger-stop0 and general-gate0.
Figure 1.3 : Traffic flowing through each gate per hour.
b) Next we analysed the duration of stay at all the camping locations. There are 10063 camping visitors in total (of which 9887 are unique). We categorize the camping visitors into 4 groups. The ‘less than 3 days’ visitors are by far the largest group, approximately 8000; the ‘4-9 days’ visitors number 1800; the rest are ‘10-17 days’ visitors (277) and ‘18-35 days’ visitors (32). This implies that visitors typically stay at a camping location for less than 3 days, while a small percentage opt for a longer stay of between 4 and 10 days. Those who stayed for 10-17 days all visited the preserve only once. Figure 1.4 shows the distribution of the duration of stay at all the camping locations.
Figure 1.4 : Distribution of duration of stay at camping location.
c) Figure 1.5 shows the traffic through each gate on a given date. From this, we observed that all the entrance gates are visited about equally, and that ranger-stop0 and ranger-stop2 have more check-ins/check-outs than any other ranger-stop.
Figure 1.5 : Datewise total number of visitors to each gate
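The stay-duration grouping in item b) can be sketched as follows. The sketch assumes that a camping stay shows up as two consecutive readings of the same vehicle at the same camping gate (arrival and departure) and that timestamps use the format "YYYY-MM-DD HH:MM:SS". The exact bucket boundaries are also an assumption, since the text does not say how a stay of exactly 3 days is classified; all names and sample rows are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def camping_stays(readings):
    """Return (car_id, campsite, duration) for each arrive/leave pair."""
    by_car = defaultdict(list)
    for ts, car_id, _car_type, gate in readings:
        by_car[car_id].append((datetime.strptime(ts, FMT), gate))

    stays = []
    for car_id, events in by_car.items():
        events.sort()  # chronological order per vehicle
        i = 0
        while i + 1 < len(events):
            (t_in, gate), (t_out, next_gate) = events[i], events[i + 1]
            if gate.startswith("camping") and next_gate == gate:
                stays.append((car_id, gate, t_out - t_in))
                i += 2  # both readings of this stay are consumed
            else:
                i += 1
    return stays


def bucket(duration):
    """Map a timedelta to a duration group (boundaries are assumed)."""
    days = duration.days  # whole days, rounded down
    if days <= 3:
        return "3 days or less"
    if days <= 9:
        return "4-9 days"
    if days <= 17:
        return "10-17 days"
    return "18 days or more"


# Made-up sample rows for illustration only.
sample = [
    ("2015-06-01 14:00:00", "car-A", "3", "entrance1"),
    ("2015-06-01 15:00:00", "car-A", "3", "camping6"),
    ("2015-06-02 23:00:00", "car-A", "3", "camping6"),
    ("2015-06-03 00:00:00", "car-A", "3", "entrance1"),
    ("2015-07-01 10:00:00", "car-B", "1", "camping2"),
    ("2015-07-06 10:00:00", "car-B", "1", "camping2"),
]
stays = camping_stays(sample)
histogram = Counter(bucket(duration) for _car, _site, duration in stays)
```

The resulting histogram is the kind of table a distribution plot like figure 1.4 could be drawn from.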

2. Patterns of Life analyses may also depend on understanding what patterns appear over longer periods of time (in this case, over multiple days). Describe up to six patterns of life that occur over multiple days (including across the entire data set) by vehicles traveling through and within the park. Characterize the patterns by describing the kinds of vehicles participating, their spatial activities (where do they go?), their temporal activities (when does the pattern happen?), and provide a hypothesis of what the pattern represents (for example, many vehicles showing up at the same location each Saturday at the same time may suggest some activity occurring there each Saturday). Please limit your answer to six images and 500 words.

ANSWER2:
1) Figure 2.1(a) shows the count of vehicles in the preserve on a given date. June, July, August and September are the peak months, and vehicle types 1, 2 and 3 are more active (and more numerous) than the other types. From figure 2.1(b) (count of vehicle check-ins per month per gate) we can see that general-gate1, general-gate2, general-gate4, general-gate5, general-gate7, ranger-stop0 and ranger-stop2 are the most visited gates.

Figure 2.1(a) : Count of vehicles inside the preserve at a given time
Figure 2.1(b) : Count of vehicles visiting a gate in a month

2) Vehicles of type 5 and 6 visit only a very restricted list of gates, i.e., any of the 4 entrances, general-gate1, general-gate2, general-gate4, general-gate5, general-gate7, ranger-stop0 and ranger-stop2. Figure 2.2 shows the union of all the paths followed by vehicles of type 5 and 6. Figure 2.3 shows the count of vehicles on each path. There are 2061 cars of type 6 and more than 50% (1232) go via general-gate1, ranger-stop0, ranger-stop2, general-gate0. Also, there are no repeat visitors of car-type 5 or 6.
Figure 2.2: General route followed by vehicles of type 5,6
Figure 2.3: route between two gates and their count (for vehicles of type 5, 6)

3) The most common path followed by 2P vehicles is: ('ranger-base', 'gate8', 'general-gate5', 'gate3', 'ranger-stop3', 'ranger-stop3', 'gate3', 'camping8', 'general-gate3', 'gate4', 'ranger-stop5', 'ranger-stop5', 'gate4', 'gate5', 'ranger-stop6', 'ranger-stop6', 'gate5', 'gate8', 'ranger-base')

4) Next, we looked for holiday patterns by filtering the dataset on the following holidays: ['2015-05-25', '2015-07-03', '2015-09-07', '2015-10-12', '2015-11-11', '2015-11-26', '2015-12-25', '2016-01-01','2016-01-18', '2016-02-15', '2016-05-30']. These holidays are taken from the US calendar. Filtering the dataset for 2P type vehicles, we found a few vehicles that run exclusively on those days. These vehicles (34 in total) travel on only one day, once a year, and only on a holiday.

5) The case is similar for vehicles of type 4, 5 and 6. On 3rd July, there were comparatively more visitors of vehicle types 1, 2 and 3, and most of them were going camping that day.

6) Based on the count for each path, we can say that most vehicles follow the route: general-gate1, ranger-stop0, ranger-stop2, general-gate2
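The holiday filter described in item 4 can be sketched as below. The holiday list is the one quoted in item 4; the tuple layout of the readings, the ISO-style timestamp strings, the function name and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict

# Holiday dates as listed in item 4 of this answer.
HOLIDAYS = frozenset([
    "2015-05-25", "2015-07-03", "2015-09-07", "2015-10-12",
    "2015-11-11", "2015-11-26", "2015-12-25", "2016-01-01",
    "2016-01-18", "2016-02-15", "2016-05-30",
])


def holiday_only_vehicles(readings, car_type="2P", holidays=HOLIDAYS):
    """Return car-ids of `car_type` whose readings all fall on holidays.

    Timestamps are assumed to be strings that start with "YYYY-MM-DD".
    """
    days_by_car = defaultdict(set)
    for ts, car_id, ctype, _gate in readings:
        if ctype == car_type:
            days_by_car[car_id].add(ts[:10])  # keep only the date part
    # `days <= holidays` is a subset test on sets.
    return sorted(car for car, days in days_by_car.items() if days <= holidays)


# Made-up sample rows for illustration only.
sample = [
    ("2015-07-03 08:00:00", "ranger-X", "2P", "ranger-base"),
    ("2015-07-03 09:00:00", "ranger-X", "2P", "gate8"),
    ("2015-07-03 08:30:00", "ranger-Y", "2P", "ranger-base"),
    ("2015-07-04 08:30:00", "ranger-Y", "2P", "ranger-base"),
    ("2015-12-25 10:00:00", "car-Z", "1", "entrance1"),
]
holiday_only = holiday_only_vehicles(sample)
```

Checking `len(days) == 1` in addition would restrict the result to vehicles seen on a single day, which matches the description of vehicles that travel on only one day.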

3. Unusual patterns may be patterns of activity that changes from an established pattern, or are just difficult to explain from what you know of a situation. Describe up to six unusual patterns (either single day or multiple days) and highlight why you find them unusual. Please limit your answer to six images and 500 words.

1) Apart from 2P vehicles, cartype4 vehicles are the only vehicles that visited gate3, gate5, gate6, ranger-stop3 and ranger-stop6 [Figure 3.2]. We observed this using the following heatmap [figure 3.1].
Figure 3.1 : gate vs Car type heatmap
Figure 3.2 : Path followed by cartype4 visiting ranger-stop6
From this pattern, we can say that they have been visiting ranger-stop3 and ranger-stop5 regularly.
2) The car with id '20155705025759-63' (cartype 3) entered the park in May 2015, has been roaming inside the park ever since, and has not yet exited. It is the only vehicle in the park that is not of type '2P' and hasn’t exited [figure 3.1].
Figure 3.1 : Vehicle roaming within park round the year
Figure 3.2 : Vehicles bypassing gate2 and going directly to ranger-stop1 from entrance1

3) Another unusual pattern involves the car-ids 20152810102803-808, 20152810102819-458, 20152810102828-459, 20152910102928-970, 20152910102959-782 and 20153010103017-871 (all cartype1). Apart from 2P vehicles, these are the only cars that visited ranger-stop1, and they all did so on the same day [Figure 3.1]. All of them followed the path entrance1 to ranger-stop1 to entrance1 on that single day, always bypassing gate2; that is, gate2 has no entry for these vehicles. There seems to be some malicious activity going on at ranger-stop1 once every year (around 2015-07-10). [figure 3.2]

4) 20155320015318-618 and 20154916084939-496 are the only vehicles of 2P type that visited camping1 and general-gate0.

5) 20161008061012-639, 20160623090611-424, 20150322080300-861, 20153427103455-30, 20150204100226-134 and 20154501084537-684 (cartype 4) followed the same pattern, i.e., they entered and exited within one minute, then visited again and stayed for 1 hour. All of them visited general-gate1.

6) Vehicles 20150827060825-381 and 20162811042816-300 are of 2P type; they traveled on only 1 day and are the only vehicles that went from one camping location to another. We observed this using figure 3.3, where each entry shows the count of vehicles going from one location to another.

Figure 3.3 : Count of Vehicles going from one location to another
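A location-to-location count of the kind shown in figure 3.3 can be built by pairing each reading of a vehicle with its next reading. The sketch below is one possible way to do this, not necessarily how the figure was produced. It counts traversals rather than distinct vehicles, and it assumes ISO-style timestamp strings that sort chronologically; the names and sample rows are hypothetical.

```python
from collections import Counter, defaultdict


def transition_counts(readings):
    """Count (from_gate, to_gate) pairs over consecutive readings per car."""
    by_car = defaultdict(list)
    for ts, car_id, _car_type, gate in readings:
        by_car[car_id].append((ts, gate))

    counts = Counter()
    for events in by_car.values():
        events.sort()  # ISO-style timestamps sort chronologically as text
        gates = [gate for _ts, gate in events]
        counts.update(zip(gates, gates[1:]))
    return counts


# Made-up sample rows for illustration only.
sample = [
    ("2015-07-10 10:28:00", "car-A", "1", "entrance1"),
    ("2015-07-10 10:40:00", "car-A", "1", "ranger-stop1"),
    ("2015-07-10 10:55:00", "car-A", "1", "entrance1"),
    ("2015-08-27 06:08:00", "car-B", "2P", "camping1"),
    ("2015-08-27 07:00:00", "car-B", "2P", "camping1"),
    ("2015-08-27 08:00:00", "car-B", "2P", "camping3"),
]
counts = transition_counts(sample)

# Moves between two different camping locations, as discussed in item 6.
camp_to_camp = {
    pair: n
    for pair, n in counts.items()
    if pair[0].startswith("camping")
    and pair[1].startswith("camping")
    and pair[0] != pair[1]
}
```

A missing pair such as ("gate2", "ranger-stop1") for a vehicle that still reaches ranger-stop1 is the kind of gap that item 3 describes.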

4. What are the top 3 patterns you discovered that you suspect could be most impactful to bird life in the nature preserve? (Short text answer)

The top three patterns that we found are:
1) The car with id '20155705025759-63' (cartype 3) seems the most suspicious, as it entered the park on 2015-06-05 and has been roaming inside the park ever since. It goes from one camping location to another and has visited only a small set of gates (all the general-gates, camping and ranger-stop). Park security needs to check whether any visitor is still in the park after staying for such a long duration. This vehicle's stay at a camping location runs up to 1 month.
2) From the dataset, we inferred that ranger-stop1 is meant to be visited only by 2P vehicles. However, on one particular day (2015-07-10), a small set of vehicles (20152810102803-808, 20152810102819-458, 20152810102828-459, 20152910102928-970, 20152910102959-782 and 20153010103017-871, all cartype1) visited ranger-stop1 without checking in at gate2.
3) There is heavy traffic through the gates 'general-gate7', 'general-gate2', 'general-gate1' and 'ranger-stop2' throughout the year, and during the peak months in particular. Cartype 5 and 6 vehicles go to a certain set of gates only, and they visited the preserve only once. Cartype 1, 2 and 3 vehicles are more numerous than the others and follow only a few routes. We also found a small set of 2P vehicles that traveled only on holidays in the US calendar.